
Update Dashboard addon to version 1.8.0 and align /ui redirect with it #53046

Merged
1 commit merged on Dec 1, 2017

Conversation

@maciaszczykm (Member) commented Sep 26, 2017

What this PR does / why we need it: In Dashboard 1.8.0 we have introduced a number of changes (security, settings, new resources, etc.) and fixed a lot of bugs. You can check the release notes at https://github.com/kubernetes/dashboard/releases/tag/v1.8.0.

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #

Special notes for your reviewer:

Release note:

Updated Dashboard add-on to version 1.8.0: The Dashboard add-on now deploys with https enabled. The Dashboard can be accessed via kubectl proxy at http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/. The /ui redirect is deprecated and will be removed in 1.10.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Sep 26, 2017
@k8s-github-robot k8s-github-robot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Sep 26, 2017
@maciaszczykm (Member, Author):

/assign @lavalamp

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Sep 27, 2017
@maciaszczykm changed the title from "Update Dashboard addon to version 1.7.0" to "Update Dashboard addon to version 1.7.1" on Oct 3, 2017
@roberthbailey (Contributor):

@bryk - can you take a look?

@floreks (Member) commented Oct 10, 2017

We are pretty sure that failed tests are related to issue #53382.

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 16, 2017
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 16, 2017
@maciaszczykm (Member, Author):

@roberthbailey Together with @floreks we have managed to fix the failing tests. Can you take a look?

@maciaszczykm (Member, Author):

The issue mentioned earlier was fixed by a change in our init container.

@@ -31,12 +36,26 @@ spec:
memory: 100Mi
ports:
- containerPort: 9090
Contributor:
Why didn't this port change if everything is shifting to 8443?

Member:

It should be changed. This option should expose port 9090 of this container, right? Is this overridden by the EXPOSE option in the Dockerfile? This container will actually only expose port 8443.

Member, Author:

Missed it before. Fixed now.

@roberthbailey (Contributor):

Please squash your commits.

/assign @mikedanese

to look at the RBAC changes.

rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["create", "watch"]
Member:

What is the watch used for? cc @kubernetes/sig-auth-pr-reviews

Member:

Not something I'd recommend allowing (this is equivalent exposure to listing all secrets in the namespace). If the Dashboard were in its own namespace this would still not be ideal, but it could be more palatable; in kube-system it is not a reasonable default policy.

@floreks (Member) commented Oct 17, 2017

I'd love to restrict it even further, but there is no option to define a rule that watches changes on a single resource. Dashboard actually only watches a single Dashboard-exclusive resource (a secret named kubernetes-dashboard-key-holder): https://github.com/kubernetes/dashboard/blob/master/src/app/backend/sync/secret.go#L169

Since we are not exposing any endpoint that could allow exploiting that, the only way to somehow abuse this permission would be stealing the token from inside the pod.

PS. It is still a huge step forward for us from the full cluster-admin permissions that were granted previously. We'll update the rules if at some point it becomes possible to restrict a watch rule to a single resource.
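For context, a minimal sketch (assuming a recent client-go, where Watch takes a context) of how a client can scope a watch to one named secret via a field selector; the package, function and variable names here are illustrative, not the Dashboard's actual code. RBAC still has to allow watching all secrets in the namespace, which is exactly what the reviewers object to.

package keysync

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/fields"
	"k8s.io/apimachinery/pkg/watch"
	"k8s.io/client-go/kubernetes"
)

// watchKeyHolder opens a watch that is scoped client-side to a single named
// secret. The field selector narrows what the client receives, but the RBAC
// rule quoted above must still grant watch on all secrets in the namespace.
func watchKeyHolder(ctx context.Context, client kubernetes.Interface, namespace, name string) (watch.Interface, error) {
	return client.CoreV1().Secrets(namespace).Watch(ctx, metav1.ListOptions{
		FieldSelector: fields.OneTermEqualSelector("metadata.name", name).String(),
	})
}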

Member:

Until individual watch authz is available, you can do individual gets of the secret, or mount it into the dashboard pod and react to changes in the mounted content.

Member:

Agree with liggitt. Don't use watch, just poll with gets. This is what kubelet does.
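A minimal sketch of the poll-with-gets approach suggested here, assuming a recent client-go (where Get takes a context); the package name, interval, secret name and apply callback are illustrative, not the Dashboard's actual implementation. With this approach the RBAC rule above would only need the create and get verbs instead of watch.

package keysync

import (
	"context"
	"log"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// pollKeyHolder periodically GETs the single secret the Dashboard cares about
// instead of holding a namespace-wide watch on secrets, and hands the data to
// a caller-supplied callback that refreshes the in-memory key.
func pollKeyHolder(ctx context.Context, client kubernetes.Interface, namespace, name string,
	interval time.Duration, apply func(data map[string][]byte)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		secret, err := client.CoreV1().Secrets(namespace).Get(ctx, name, metav1.GetOptions{})
		if err != nil {
			log.Printf("failed to get secret %s/%s: %v", namespace, name, err)
		} else {
			apply(secret.Data)
		}
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
		}
	}
}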

Member:

We could use multiple decryption keys, but how does this solve the issue of syncing them with all replicas?

Let's consider the use case where Dashboard is behind a load balancer and there is more than one replica. Let's say a user has logged in and the token was encrypted with the locally synchronized key of backend-1. Then the user gets redirected to backend-2 without knowing it, and backend-2 might not have this key synchronized yet if we use a polling mechanism with e.g. a 5-minute period. As a result the user is forcibly logged out.

Second use case: we have one replica and the key is synchronized with a secret. The secret gets deleted manually. In the meantime Dashboard is scaled up to two replicas. The second replica cannot find the secret, so it generates a new encryption key and stores it in a secret. Because of polling we now have two replicas with different keys that will be out of sync for a few minutes.

Currently, when the secret gets deleted it is immediately recreated based on the local copy stored in one of the replicas.

Member:

Then the user gets redirected to backend-2 without knowing it, and backend-2 might not have this key synchronized yet if we use a polling mechanism with e.g. a 5-minute period. As a result the user is forcibly logged out.

  1. The secret contains [key1]; all replicas use key1 for encrypting and decrypting.
  2. Update the secret to contain [key1, key2]. As replicas observe the new secret, they use key1 for encrypting and attempt decrypting with key1 and key2.
  3. Wait at least as long as your secret distribution period, then update the secret to contain [key2, key1]. As replicas observe the new secret, they use key2 for encrypting and attempt decrypting with key2 and key1.
  4. Wait as long as your cookie expiration period (so cookies created using key1 would no longer be valid), then update the secret to contain [key2].

The wait after step 2 is required to let all replicas observe the new decryption key before starting to use it. Alternately, they could react to decryption failures by re-polling the secret to see if there is a new key for them to use.

The wait after step 3 is required to avoid logging out users who logged in and have a session that can only be decrypted by the previous key.
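A rough sketch of the multi-key decryption attempt implied by the steps above; the key layout (an ordered list whose first entry is the active encryption key) and the decrypt callback are assumptions for illustration, not the Dashboard's API.

package keysync

import "errors"

// tryDecrypt attempts decryption with every key currently published in the
// shared secret. keys[0] is assumed to be the active encryption key; older
// keys are kept only so sessions created under a previous key keep working.
func tryDecrypt(token []byte, keys [][]byte, decrypt func(token, key []byte) ([]byte, error)) ([]byte, error) {
	if len(keys) == 0 {
		return nil, errors.New("no decryption keys available")
	}
	var lastErr error
	for _, key := range keys {
		plaintext, err := decrypt(token, key)
		if err == nil {
			return plaintext, nil
		}
		lastErr = err
	}
	return nil, lastErr
}

Because encryption always uses the first key, reordering the secret to [key2, key1] in step 3 switches the active key without invalidating sessions that were created under key1.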

@floreks (Member) commented Oct 19, 2017

Right now we do not have a key rotation mechanism implemented; however, there is a fallback mechanism that forces a synchronous update of the secret in case decryption fails.

Still, with the current implementation I think polling would not work, and it would have to be extended to support storing multiple keys in a secret (even that would need some rework to work properly).

A case in which this would not work with polling:

  1. Start with one replica; it generates and creates a secret with key-1.
  2. The secret gets deleted.
  3. Scale to two replicas. The new replica creates the secret with a new key-2. It does not have any information about the old key-1.
  4. A request goes to the second replica. A token encrypted with key-1 cannot be decrypted with the new key-2. The user is logged out.

If points 2-4 happen during a polling interval and the new replica is not able to synchronize both keys, then there is a problem. Currently this problem is very unlikely to happen because, thanks to the watch, the secret gets immediately recreated from the local copy.

Member, Author:

Hi, how should we proceed to get it merged? Should we implement the behaviour described by @liggitt, move it to another namespace (can Dashboard be a cluster service then?), or do @floreks' concerns sound reasonable and is there another way to go?

@liggitt (Member) commented Oct 31, 2017

A case in which this would not work with polling:

  1. Start with one replica; it generates and creates a secret with key-1.
  2. The secret gets deleted.
  3. Scale to two replicas. The new replica creates the secret with a new key-2. It does not have any information about the old key-1.
  4. A request goes to the second replica. A token encrypted with key-1 cannot be decrypted with the new key-2. The user is logged out.

Yes, deleting state disrupts a rolling update. The same thing would happen with watch (unless you had replica-1 repopulate the secret with potentially old keys, which I wouldn't expect if the secret is supposed to be the authoritative shared state).

Should we implement the behaviour described by @liggitt

Moving to polling seems reasonable for such a slow-moving object, especially given the security trade-off of granting complete access to all kube-system secrets. You could even do a rate-limited re-poll when a decode error is encountered, to stay responsive to key changes on demand.

move it to another namespace (can Dashboard be a cluster service then?)

That would be ideal, but I think the add-on manager only targets the kube-system namespace today.

do @floreks' concerns sound reasonable and is there another way to go?

In order for existing user sessions to continue working and to preserve the ability to scale replicas up and down, you have to keep old decryption keys available in shared state (in the secret) for as long as your user sessions last.
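A small sketch of the rate-limited re-poll idea mentioned above, assuming golang.org/x/time/rate; the cachedKeys, refreshKeys and tryDecrypt helpers are hypothetical stand-ins for whatever the Dashboard would use to reload and apply keys from the secret.

package keysync

import (
	"errors"
	"time"

	"golang.org/x/time/rate"
)

// repollLimiter allows at most one forced secret refresh per 30 seconds, so a
// burst of requests carrying stale tokens cannot hammer the API server.
var repollLimiter = rate.NewLimiter(rate.Every(30*time.Second), 1)

// decryptToken tries the locally cached keys first; on failure it refreshes
// the keys from the secret (rate-limited) and retries once, staying responsive
// to key changes without holding a namespace-wide watch.
func decryptToken(token []byte, cachedKeys func() [][]byte, refreshKeys func() error,
	tryDecrypt func(token []byte, keys [][]byte) ([]byte, error)) ([]byte, error) {

	if plaintext, err := tryDecrypt(token, cachedKeys()); err == nil {
		return plaintext, nil
	}
	if !repollLimiter.Allow() {
		return nil, errors.New("decryption failed and key re-poll is rate limited")
	}
	if err := refreshKeys(); err != nil {
		return nil, err
	}
	return tryDecrypt(token, cachedKeys())
}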

@liggitt (Member) commented Nov 28, 2017

/retest

@liggitt (Member) commented Nov 29, 2017

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 29, 2017
@roberthbailey (Contributor):

/approve no-issue

@enisoc (Member) commented Nov 29, 2017

This has been approved for an extension until the end of Dec 1.

@enisoc enisoc added priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. status/in-progress and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Nov 29, 2017
@enisoc enisoc added this to the v1.9 milestone Nov 29, 2017
@k8s-github-robot:

[MILESTONENOTIFIER] Milestone Pull Request Current

@bryk @lavalamp @liggitt @maciaszczykm @mikedanese @roberthbailey @zmerlynn

Note: This pull request is marked as priority/critical-urgent, and must be updated every 1 day during code freeze.

Example update:

ACK.  In progress
ETA: DD/MM/YYYY
Risks: Complicated fix required
Pull Request Labels
  • sig/ui: Pull Request will be escalated to these SIGs if needed.
  • priority/critical-urgent: Never automatically move pull request out of a release milestone; continually escalate to contributor and SIG through all available channels.
  • kind/feature: New functionality.

@bryk (Contributor) commented Nov 30, 2017

@floreks @maciaszczykm Can you work on getting approvals from hack/OWNERS and pkg/routes/OWNERS?

@floreks (Member) commented Nov 30, 2017

/assign @lavalamp

Could you take a look?

@bryk (Contributor) commented Nov 30, 2017

ACK. Needs OWNERS approval
ETA: when it gets LGTM
Risks: none

@maciaszczykm (Member, Author):

@deads2k @sttts @lavalamp Could one of you take a look?

@maciaszczykm (Member, Author):

ACK. Needs OWNERS approval
ETA: when it gets LGTM
Risks: none

@deads2k (Contributor) commented Dec 1, 2017

The apimachinery is no worse than it was before. Thanks for noting it will be removed in a later release.

/approve

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 1, 2017
@k8s-github-robot:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, liggitt, maciaszczykm, roberthbailey

Associated issue requirement bypassed by: roberthbailey

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot:

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot:

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

Labels
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
  • kind/feature: Categorizes issue or PR as related to a new feature.
  • lgtm: "Looks good to me", indicates that a PR is ready to be merged.
  • priority/critical-urgent: Highest priority. Must be actively worked on as someone's top priority right now.
  • release-note: Denotes a PR that will be considered when it comes time to generate release notes.
  • sig/ui: Categorizes an issue or PR as relevant to SIG UI.
  • size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.